NSF PAR Search | NSF Public Access Repository

Scalable Spectral Clustering with Group Fairness Constraints

Ji Wang; Ding Lu; Ian Davidson; Zhaojun Bai (April 2023, Proceedings of The 26th International Conference on Artificial Intelligence and Statistics)
Ruiz, Francisco; Dy, Jennifer; van de Meent, Jan-Willem (Ed.)
There are synergies between research interests and industrial efforts in modeling fairness and correcting algorithmic bias in machine learning. In this paper, we present a scalable algorithm for spectral clustering (SC) with group fairness constraints. Group fairness is also known as statistical parity: in each cluster, each protected group is represented in the same proportion as in the entire dataset. While the FairSC algorithm (Kleindessner et al., 2019) is able to find a fairer clustering, it is compromised by high computational costs because its computational kernels explicitly compute nullspaces and square roots of dense matrices. We present a new formulation of the underlying spectral computation of FairSC that incorporates nullspace projection and Hotelling’s deflation, so that the resulting algorithm, called s-FairSC, involves only sparse matrix-vector products and can fully exploit the sparsity of the fair SC model. Experimental results on the modified stochastic block model demonstrate that s-FairSC is comparable with FairSC in recovering fair clusterings while being 12× faster for moderate model sizes. s-FairSC is further demonstrated to be scalable in the sense that its computational cost increases only marginally compared to SC without fairness constraints.
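The statistical parity condition described in the abstract can be illustrated with a small check. The following is a minimal sketch, not the authors' s-FairSC algorithm: it only tests whether a given clustering meets the group fairness definition. The function names `cluster_group_proportions` and `satisfies_group_fairness`, the tolerance `tol`, and the toy data are all assumptions introduced for illustration.

```python
from collections import Counter


def cluster_group_proportions(clusters, groups):
    """Return {cluster: {group: share of that cluster's members in the group}}.

    `clusters[i]` is the cluster label of item i and `groups[i]` is its
    protected-group label (both hypothetical inputs for this sketch).
    """
    if len(clusters) != len(groups):
        raise ValueError("clusters and groups must have the same length")
    sizes = Counter(clusters)              # members per cluster
    joint = Counter(zip(clusters, groups))  # members per (cluster, group) pair
    all_groups = set(groups)
    return {
        c: {g: joint[(c, g)] / sizes[c] for g in all_groups}
        for c in sizes
    }


def satisfies_group_fairness(clusters, groups, tol=1e-9):
    """True if every group's share in every cluster matches its overall share."""
    n = len(groups)
    overall = {g: k / n for g, k in Counter(groups).items()}
    per_cluster = cluster_group_proportions(clusters, groups)
    return all(
        abs(share - overall[g]) <= tol
        for shares in per_cluster.values()
        for g, share in shares.items()
    )


# Toy data: eight items, two protected groups of equal size.
groups = ["a", "a", "b", "b", "a", "a", "b", "b"]
fair = [0, 1, 0, 1, 0, 1, 0, 1]    # each cluster is half "a", half "b"
unfair = [0, 0, 1, 1, 0, 0, 1, 1]  # cluster 0 is all "a", cluster 1 all "b"

print(satisfies_group_fairness(fair, groups))
print(satisfies_group_fairness(unfair, groups))
```

In this toy case the first clustering mirrors the overall 50/50 split in both clusters, while the second separates the groups completely. In practice exact parity is rarely attainable, so a looser tolerance or a balance score may be more useful than a strict pass/fail test.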
Full Text Available
Deep Learning for Prognosis Using Task-fMRI: A Novel Architecture and Training Scheme

https://doi.org/10.1145/3534678.3539362

Shi, Ge; Jason Smucny; Ian Davidson (August 2022, Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Full Text Available
Behavioral differences: insights, explanations and comparisons of French and US Twitter usage during elections

https://doi.org/10.1007/s13278-019-0611-9

Ian Davidson, Antoine Gourru (January 2020, Social Network Analysis and Mining)

Social networks and social media have played a key role in observing and influencing how the political landscape takes shape and dynamically shifts. This is especially true in events such as national elections, as indicated by earlier studies with Facebook (Williams and Gulati, in: Proceedings of the annual meeting of the American Political Science Association, 2009) and Twitter (Larsson and Moe in New Med Soc 14(5):729–747, 2012). Not surprisingly, in an attempt to better understand and simplify these networks, community discovery methods such as the Louvain method (Blondel et al. in J Stat Mechanics Theory Exp 2008(10):P10008, 2008) have been used to understand elections (Gaumont et al. in PLoS ONE 13(9):e0201879, 2018). However, most community-based studies first simplify the complex Twitter data into a single network based on (for example) follower, retweet or friendship properties. This requires ignoring some information or combining many types of information into a graph, which can mask many insights. In this paper, we explore Twitter data as a time-stamped vertex-labeled graph. The graph structure can be given by a structural relation between the users such as a retweet, friendship or follower relation, whilst the behavior of the individual is given by their posting behavior, which is modeled as time-evolving vertex labels. We explore leveraging existing community discovery methods to find communities using just the structural data and then describe these communities using behavioral data. We explore two complementary directions: (1) creating a taxonomy of hashtags based on their community usage and (2) efficiently describing the communities, expanding our recently published work. We have created two datasets, one each for the French and US elections, from which we compare and contrast insights on the usage of hashtags.
Full Text Available